Enable Client Side Access Logs for SD #2955
Do we want a config option to control whether both server and client access logs are enabled?
| inline bool StackdriverRootContext::enableServerAccessLog() { | ||
| return !config_.disable_server_access_logging() && !isOutbound(); | ||
| inline bool StackdriverRootContext::enableAccessLog() { | ||
| return !config_.disable_server_access_logging(); |
I feel like we want more config around enabling client access logging and, relatedly, we need to assess whether or not disable_server_access_logging should be renamed.
Should we just add disable_client_access_logging? Renaming disable_server_access_logging might cause compatibility issues.
Added disable_client_access_logging
extensions/stackdriver/log/logger.cc
Outdated
| (*label_map)["destination_name"] = | ||
| flatbuffers::GetString(local_node_info.name()); | ||
| void Logger::fillDestinationLabels( | ||
| const ::Wasm::Common::FlatNode& node_info, |
nit: destination_node_info
extensions/stackdriver/log/logger.cc
Outdated
| } | ||
|
|
||
| void Logger::fillSourceLabels( | ||
| const ::Wasm::Common::FlatNode& node_info, |
| destination_version: v1 | ||
| protocol: http | ||
| log_sampled: "true" | ||
| upstream_cluster: "outbound|9080|http|server.default.svc.cluster.local" |
Why does this entry not include any trace-related information?
We are not sending any requests with trace headers. I will add a test for that in a separate PR.
@douglas-reid: I think it would be good to have separate configs for both. @bianpengyuan, @mandarjog: wdyt?
|
Per discussion
or just one bool? |
Need 2 in my mind..
|
a1b0092 to 66f2075
|
@bianpengyuan, @douglas-reid , @kyessenov, @mandarjog : This is ready for review.. |
| bool disable_client_access_logging = 10; | ||
|
|
||
| // Optional. Controls whether to export client access log if MX with server | ||
| // fails and there was an error in request/connection. |
I think we need to document what this means in the face of disable_client_access_logging.
It naively seems like this is ignored if disable_client_access_logging is true. And if disable_client_access_logging is false, this seems like it then would filter out error logs if false.
But I don't believe that is actually how it works.
Can we clarify?
| google.protobuf.BoolValue enable_log_compression = 9; | ||
|
|
||
| // Optional. Controls whether to export client access log. | ||
| bool disable_client_access_logging = 10; |
Thinking about this more, I feel like maybe we should move to a Logging enum (partially because I don't like disable booleans, and partially because we now have multiple options).
Maybe something like:
enum AccessLogging {
NONE = 0;
SERVER_ONLY = 1;
SERVER_AND_CLIENT_ERRORS = 2;
SERVER_AND_CLIENT = 3;
CLIENT_ERRORS_ONLY = 4;
CLIENT_ONLY = 5;
}
AccessLogging access_logging = 10;
Thoughts?
I like the idea, since it avoids the problems of handling multiple combinations of booleans. We would just have to make sure it matches how the option is configured in Istio via values.
How about this?
`
enum AccessLogging {
NONE = 0;
FULL_SERVER_ONLY = 1;
FULL_SERVER_AND_CLIENT_ERRORS_ONLY = 2;
FULL_SERVER_AND_FULL_CLIENT = 3;
CLIENT_ERRORS_ONLY = 4;
FULL_CLIENT_ONLY = 5;
}
AccessLogging access_logging = 10;
`
This is just to clarify enum option 2 above.
FULL seems confusing to me. It makes sense to me that SERVER_ONLY means log all requests.
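To make the semantics of the first enum proposal concrete, here is a minimal sketch of how each value could map to a per-request logging decision. The C++ enum mirrors the proposed proto enum; the function name `shouldLog` and its `outbound` and `is_error` parameters are hypothetical and are not taken from the PR.

```cpp
// Hypothetical C++ mirror of the AccessLogging enum proposed above.
enum class AccessLogging {
  NONE = 0,
  SERVER_ONLY = 1,
  SERVER_AND_CLIENT_ERRORS = 2,
  SERVER_AND_CLIENT = 3,
  CLIENT_ERRORS_ONLY = 4,
  CLIENT_ONLY = 5,
};

// Decides whether one request is logged. `outbound` is true on the client
// side; `is_error` is true when the request or connection failed.
// Both parameters are assumptions made for this sketch.
inline bool shouldLog(AccessLogging mode, bool outbound, bool is_error) {
  switch (mode) {
    case AccessLogging::NONE:
      return false;
    case AccessLogging::SERVER_ONLY:
      return !outbound;
    case AccessLogging::SERVER_AND_CLIENT_ERRORS:
      return !outbound || is_error;
    case AccessLogging::SERVER_AND_CLIENT:
      return true;
    case AccessLogging::CLIENT_ERRORS_ONLY:
      return outbound && is_error;
    case AccessLogging::CLIENT_ONLY:
      return outbound;
  }
  return false;
}
```

Read this way, SERVER_ONLY already means "log every server-side request", which is the point made about the FULL prefix being redundant.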
|
@bianpengyuan, @douglas-reid , @kyessenov, @mandarjog : This is ready for review.. |
| enum AccessLogging { | ||
| // No Logs. | ||
| None = 0; | ||
| // All logs only including both success and error logs. |
| // All logs only including both success and error logs. | |
| // All logs including both success and error logs. |
|
|
||
| inline bool StackdriverRootContext::enableServerAccessLog() { | ||
| return !config_.disable_server_access_logging() && !isOutbound(); | ||
| return (!config_.disable_server_access_logging() || |
Does the logger need to distinguish between enabling server and client logs? Since we will configure the inbound and outbound filters differently, shouldn't enablement in the logger be direction-agnostic?
The direction matters if, say, we want to enable only server logs, because the stackdriver filter will be added in both directions for metrics.
But we are going to have different configurations for the inbound and outbound filters, right? Each logger/root context should be either inbound or outbound, so in terms of enablement, I think it does not need to know about direction.
Agreed, that's why I didn't have it in my initial PR. Currently it's needed for the errors-only case, as we are enabling that for the client only. I will refactor this part once we make a decision on configs.
| FULL = 1; | ||
| // All error logs when no metadata is available of the peer. This is | ||
| // currently only available for outbound/client side logs. | ||
| ERR_ONLY_ON_NO_MX = 2; |
I think this is an unnecessarily subtle condition, which relates too much to the implementation details. IMO we should offload the log decision to the sample filter. Ideally the log sample filter could support CEL and flexibly decide whether to log based on the presence of any header, including MX ones. Since the log sample filter does not support CEL for now, I think it is fine to log every error, and we can always optimize further by adding CEL support to the sample filter in the future.
If offloading to log sample filter is the agreed approach, do we need to make this an enum? Shall we keep it as boolean (or a BoolValue if disable field name is undesired) and rename it to something without server in it, like enable_access_log?
Offloading to the sample filter would involve a change in the sampling filter, and this option was added to fast-track client-side logs, so that we don't enable both client-side and server-side logs by default and increase logging cost.
I totally agree with offloading the logging decision to the sampling filter rather than the stackdriver filter. For HTTP, we can decide based on the presence of a header, and for TCP, on the presence of filter state.
Judging whether to log based on presence of mx data seems to be the ultimate way of reducing log volume and not sure if that is really necessary for initial release. I think only logging errors (which log sample filter already supports) already seems to be a good budget choice. IMHO we should start with this simple option and avoid complexity. @mandarjog @kyessenov @douglas-reid wdyt?
So: { NONE, ALL, ERR_ONLY }? Seems fine to me.
Updated based on offline conversation. Changed to ERR_ONLY to log on errors.
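The simplified three-value option settled on in this thread can be sketched as a direction-agnostic predicate. This is an illustration only: the names `AccessLogMode`, `RequestOutcome`, `isErrorOutcome`, and `shouldExportAccessLog` are hypothetical, and the error rule used here (status code of 400 or above, or any response flag set) is an assumption about what "error" means, not something this thread confirms.

```cpp
// Hypothetical three-value mode in the { NONE, FULL, ERRORS_ONLY } shape.
enum class AccessLogMode { NONE = 0, FULL = 1, ERRORS_ONLY = 2 };

// Minimal stand-in for the per-request data a filter would inspect.
struct RequestOutcome {
  int response_code;        // HTTP status code, 0 if not applicable
  bool has_response_flags;  // true if the proxy set any response flag
};

// Assumed error rule for this sketch: status >= 400 or a response flag.
inline bool isErrorOutcome(const RequestOutcome& outcome) {
  return outcome.response_code >= 400 || outcome.has_response_flags;
}

// Direction-agnostic decision: the inbound and outbound filters would each
// carry their own mode, so no inbound/outbound check appears here.
inline bool shouldExportAccessLog(AccessLogMode mode,
                                  const RequestOutcome& outcome) {
  switch (mode) {
    case AccessLogMode::NONE:
      return false;
    case AccessLogMode::FULL:
      return true;
    case AccessLogMode::ERRORS_ONLY:
      return isErrorOutcome(outcome);
  }
  return false;
}
```

Because the mode is set per filter instance, an outbound filter could use ERRORS_ONLY while the inbound filter uses FULL, without the predicate knowing which side it runs on.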
extensions/stackdriver/log/logger.cc
Outdated
| log_entries_request->mutable_resource()->CopyFrom(monitored_resource); | ||
| } | ||
|
|
||
| void Logger::initializeInboundLogEntryRequest( |
This function is basically the same as the following one and has quite some logic in it. Shall we combine them and parameterize it by direction?
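A rough sketch of the suggested merge, with a single initializer parameterized by direction. The struct `LogEntriesRequestSketch` stands in for the real protobuf request, the function name `initializeLogEntryRequestSketch` is hypothetical, and all constant values are placeholders rather than the extension's actual strings.

```cpp
#include <string>

// Placeholder values; the real constants are defined in the extension.
constexpr char kServerAccessLogNameSketch[] = "server-accesslog";
constexpr char kClientAccessLogNameSketch[] = "client-accesslog";
constexpr char kContainerResourceSketch[] = "container-resource";
constexpr char kPodResourceSketch[] = "pod-resource";

// Simplified stand-in for the protobuf WriteLogEntriesRequest.
struct LogEntriesRequestSketch {
  std::string log_name;
  std::string resource_type;
};

// One initializer for both directions: only the log name and the monitored
// resource type depend on `outbound`, so the shared logic is written once.
inline LogEntriesRequestSketch initializeLogEntryRequestSketch(
    const std::string& project_id, bool outbound) {
  LogEntriesRequestSketch request;
  const std::string log_name =
      outbound ? kClientAccessLogNameSketch : kServerAccessLogNameSketch;
  request.log_name = "projects/" + project_id + "/logs/" + log_name;
  request.resource_type =
      outbound ? kPodResourceSketch : kContainerResourceSketch;
  return request;
}
```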
extensions/stackdriver/log/logger.h
Outdated
| void fillDestinationLabels( | ||
| const ::Wasm::Common::FlatNode& node_info, | ||
| google::protobuf::Map<std::string, std::string>* label_map); | ||
|
|
||
| // Helper methods to fill source Labels. | ||
| void fillSourceLabels( | ||
| const ::Wasm::Common::FlatNode& node_info, | ||
| google::protobuf::Map<std::string, std::string>* label_map); | ||
|
|
||
| // Helper method to set monitored resource. | ||
| void setMonitoredResource( |
Looks like these methods fillDestinationLabels, fillSourceLabels, setMonitoredResource, and GetLogEntryType do not need to be part of logger class? Should we move it out to an anonymous namespace in logger.cc?
I actually started with an anonymous namespace but somehow moved them into Logger. Thanks, changed back to anonymous!
GetLogEntryType can't be moved, as it accesses a private member of the Logger class.
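The refactor discussed here can be illustrated with a small sketch: helpers that touch no class state live in an anonymous namespace in the .cc file, while anything that reads private members stays a method. `NodeInfoSketch` is a simplified stand-in for `::Wasm::Common::FlatNode`, `SketchLogger` is a hypothetical class, and every label key other than `destination_name` is assumed for illustration.

```cpp
#include <map>
#include <string>

// Simplified stand-in for ::Wasm::Common::FlatNode (hypothetical fields).
struct NodeInfoSketch {
  std::string name;
  std::string namespace_name;
};

using LabelMap = std::map<std::string, std::string>;

namespace {

// File-local helpers: they use only their arguments, so they need no access
// to the logger class and stay out of its header.
void fillDestinationLabels(const NodeInfoSketch& destination_node_info,
                           LabelMap* label_map) {
  (*label_map)["destination_name"] = destination_node_info.name;
  (*label_map)["destination_namespace"] = destination_node_info.namespace_name;
}

void fillSourceLabels(const NodeInfoSketch& source_node_info,
                      LabelMap* label_map) {
  (*label_map)["source_name"] = source_node_info.name;
  (*label_map)["source_namespace"] = source_node_info.namespace_name;
}

}  // namespace

class SketchLogger {
 public:
  explicit SketchLogger(bool outbound) : outbound_(outbound) {}

  // Stays a member because it reads the private outbound_ flag.
  LabelMap buildLabels(const NodeInfoSketch& local,
                       const NodeInfoSketch& peer) const {
    LabelMap labels;
    // On the client (outbound) side the local node is the source.
    fillSourceLabels(outbound_ ? local : peer, &labels);
    fillDestinationLabels(outbound_ ? peer : local, &labels);
    return labels;
  }

 private:
  bool outbound_;
};
```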
|
/test test_proxy |
| // Types of Access logs to export. | ||
| enum AccessLogging { | ||
| // No Logs. | ||
| None = 0; |
| FULL = 1; | ||
| // All error logs when no metadata is available of the peer. This is | ||
| // currently only available for outbound/client side logs. | ||
| ERR_ONLY_ON_NO_MX = 2; |
douglas-reid
left a comment
My concerns have been addressed. Removing my block.
| FULL = 1; | ||
| // All error logs. This is | ||
| // currently only available for outbound/client side logs. | ||
| ERR_ONLY = 2; |
super nit: slight preference for fully spelling this out as ERRORS_ONLY.
| // All logs including both success and error logs. | ||
| FULL = 1; | ||
| // All error logs. This is | ||
| // currently only available for outbound/client side logs. |
nit: can we add the detail from the discussion around response flags and codes here too?
Fix fmt
Fix fmt
Fix test
Added config options and test for the same
Fixed after rebase
Fixed config and added another test case
Run fmt
Change from ERR_ONLY to ERR_ONLY_ON_NO_MX
Fixed based on feedback
Updated config
Updated config
Fixed based on feedback
071dec3 to 692e909
bianpengyuan
left a comment
LGTM, only some minor comments.
| // either by a timer or by request size limit. Returns false if there is no | ||
| // log entry to be exported. | ||
| bool flush(); | ||
| void flush(Logger::LogEntryType log_entry_type); |
Should this method return boolean to follow the existing one?
There is no status to return there. It's more of a helper method.
| log_entries_request->set_log_name("projects/" + project_id_ + "/logs/" + | ||
| log_name); | ||
|
|
||
| std::string resource_type = outbound ? Common::kPodMonitoredResource |
nit: const std::string&
Can't change this to a const ref, as we reassign resource_type based on the cluster name key below.
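The constraint behind this reply can be shown with a short sketch: a variable that may be reassigned later has to be an owning `std::string`, not a `const std::string&`. The function name `pickResourceTypeSketch`, the `has_cluster_name` parameter, and all three resource strings are hypothetical placeholders.

```cpp
#include <string>

// Placeholder resource names, assumed for illustration only.
const std::string kPodTypeSketch = "pod-resource";
const std::string kContainerTypeSketch = "container-resource";
const std::string kFallbackTypeSketch = "fallback-resource";

// `has_cluster_name` stands in for the cluster name key lookup mentioned
// in the reply; how that lookup is really done is not shown in this thread.
inline std::string pickResourceTypeSketch(bool outbound,
                                          bool has_cluster_name) {
  // An owning copy rather than `const std::string&`: a const reference
  // could not be assigned to in the branch below.
  std::string resource_type =
      outbound ? kPodTypeSketch : kContainerTypeSketch;
  if (!has_cluster_name) {
    resource_type = kFallbackTypeSketch;
  }
  return resource_type;
}
```

By contrast, a name that is never reassigned after initialization, such as log_name in the hunk that follows, can take the `const std::string&` nit without any change in behavior.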
extensions/stackdriver/log/logger.cc
Outdated
| log_entries_request_map_[log_entry_type]->size = 0; | ||
| auto log_entries_request = | ||
| log_entries_request_map_[log_entry_type]->request.get(); | ||
| std::string log_name = outbound ? kClientAccessLogName : kServerAccessLogName; |
nit: const std::string&
| !config_.disable_http_size_metrics()); | ||
| if (enableServerAccessLog() && shouldLogThisRequest(request_info)) { | ||
| if ((enableAllAccessLog() || | ||
| (enableClientAccessLogOnError() && |
I still don't think we need to distinguish client or not. This can just be enableAccessLogOnError. Direction is already distinguished at filter configuration.
We are saying that we enable access logging on error on the client side only, hence the distinction.
I think the code will work for the server side too if we remove the distinction. It would be a good addition to server-side logging. @mandarjog, @douglas-reid wdyt?
I think “logOnError” will suffice.
We will not set it in server side logs config for now, but will work just as well.
|
In response to a cherrypick label: #2955 failed to apply on top of branch "release-1.7": |
|
In response to a cherrypick label: new issue could not be created for failed cherrypick: status code 410 not one of [201], body: {"message":"Issues are disabled for this repo","documentation_url":"https://docs.github.com/v3/issues/"} |
What this PR does / why we need it:
Enable Client Side Access Logs for SD
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #
Special notes for your reviewer:
Client-side logs will need sampling too. I will add it in a follow-up PR.
Release note: